I-vector based speaker recognition using advanced channel compensation techniques

Authors

Ahilan Kanagasundaram

David Dean

Sridha Sridharan

Mitchell McLaren

Robbie Vogt

Abstract

This paper investigates advanced channel compensation techniques for improving i-vector speaker verification performance in the presence of high intersession variability, using the NIST 2008 and 2010 SRE corpora. The performance of four channel compensation techniques is investigated: (a) weighted maximum margin criterion (WMMC), (b) source-normalized WMMC (SN-WMMC), (c) weighted linear discriminant analysis (WLDA) and (d) source-normalized WLDA (SN-WLDA). We show that, by extracting the discriminatory information between pairs of speakers as well as capturing the source variation information in the development i-vector space, the SN-WLDA based cosine similarity scoring (CSS) i-vector system provides over 20% improvement in EER for NIST 2008 interview and microphone verification and over 10% improvement in EER for NIST 2008 telephone verification, when compared to the SN-LDA based CSS i-vector system. Further, score-level fusion techniques are analyzed to combine the best channel compensation approaches, providing over 8% improvement in DCF over the best single approach (SN-WLDA) for the NIST 2008 interview/telephone enrolment-verification condition. Finally, we demonstrate that the improvements found in the context of CSS also generalize to state-of-the-art GPLDA, with up to 14% relative improvement in EER for NIST SRE 2010 interview and microphone verification and over 7% relative improvement in EER for NIST SRE 2010 telephone verification.
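To make the scoring pipeline described in the abstract concrete, the following is a minimal sketch of cosine similarity scoring (CSS) applied after a channel compensation projection. It is an illustration under stated assumptions, not the authors' implementation: it uses plain (unweighted) LDA in place of the WLDA and SN-WLDA variants studied in the paper, and the function names `lda_projection` and `css_score`, as well as the small ridge term added to the within-class scatter, are hypothetical choices made for this example.

```python
import numpy as np


def lda_projection(ivectors, speaker_ids, n_components):
    """Estimate a plain LDA projection matrix from development i-vectors.

    NOTE: this is standard LDA, used here as a stand-in for the weighted
    and source-normalized variants named in the abstract.
    """
    X = np.asarray(ivectors, dtype=float)
    labels = np.asarray(speaker_ids)
    dim = X.shape[1]
    global_mean = X.mean(axis=0)

    S_b = np.zeros((dim, dim))  # between-speaker scatter
    S_w = np.zeros((dim, dim))  # within-speaker (session) scatter
    for spk in np.unique(labels):
        X_s = X[labels == spk]
        mean_s = X_s.mean(axis=0)
        diff = (mean_s - global_mean)[:, None]
        S_b += len(X_s) * (diff @ diff.T)
        centered = X_s - mean_s
        S_w += centered.T @ centered

    # Small ridge so S_w is invertible (an assumption of this sketch).
    S_w += 1e-6 * np.eye(dim)

    # Leading eigenvectors of S_w^{-1} S_b span the discriminant subspace.
    evals, evecs = np.linalg.eig(np.linalg.solve(S_w, S_b))
    order = np.argsort(-evals.real)[:n_components]
    return evecs[:, order].real


def css_score(w_target, w_test, A):
    """Cosine similarity between two i-vectors after projecting with A."""
    a = A.T @ np.asarray(w_target, dtype=float)
    b = A.T @ np.asarray(w_test, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

In use, `A` would be trained once on development i-vectors with speaker labels, and each verification trial would then be scored by comparing `css_score(w_target, w_test, A)` against a decision threshold. Swapping in a different channel compensation method only changes how `A` is estimated; the scoring step stays the same.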

Similar resources

PLDA based speaker verification with weighted LDA techniques

This paper investigates the use of the dimensionality-reduction techniques weighted linear discriminant analysis (WLDA), and weighted median fisher discriminant analysis (WMFD), before probabilistic linear discriminant analysis (PLDA) modeling for the purpose of improving speaker verification performance in the presence of high inter-session variability. Recently it was shown that WLDA techniqu...

Channel compensation for SVM speaker recognition

One of the major remaining challenges to improving accuracy in state-of-the-art speaker recognition algorithms is reducing the impact of channel and handset variations on system performance. For Gaussian Mixture Model based speaker recognition systems, a variety of channel-adaptation techniques are known and available for adapting models between different channel conditions, but for the much mo...

Front-end Channel Compensation using Mixture-dependent Feature Transformations for i-Vector Speaker Recognition

State-of-the-art session variability compensation for speaker recognition is generally based on various linear statistical models of the Gaussian Mixture Model (GMM) mean super-vectors, while front-end features are only processed by standard normalization techniques. In this study, we propose a front-end channel compensation framework using mixture-localized linear transforms that operate befo...

Between-Class Covariance Correction For Linear Discriminant Analysis in Language Recognition

Linear Discriminant Analysis (LDA) is one of the most widely used channel compensation techniques in current speaker and language recognition systems. In this study, we propose a technique of Between-Class Covariance Correction (BCC) to improve language recognition performance. This approach builds on the idea of Within-Class Covariance Correction (WCC), which was introduced as a means to compen...

Dataset shift in PLDA based speaker verification

Dataset shift is a problem widely studied in the field of speaker recognition. Among the different types of dataset shift, covariate shift is the most common one in real scenarios. Traditional solutions for the problem of covariate shift have been developed in the context of channel and session variability, and make use of large datasets to train models for channel/session compensation. However...

Journal:

Computer Speech & Language

Volume 28

Publication date: 2014
